{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# NumPy: Fixed-Type Arrays in Python"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ">### References\n",
    "Video tutorials on bilibili:  \n",
    "(1) [Python tutorial: data analysis with numpy, pandas, matplotlib](https://www.bilibili.com/video/BV1hx411d7jb?from=search&seid=12190247343680651243)  \n",
    "(2) [Mofan Python data processing tutorial](https://www.bilibili.com/video/BV1Ex411L7oT?from=search&seid=12190247343680651243)  \n",
    "Recommended book:  \n",
    "*Python Data Science Handbook*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
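    "**NumPy** is short for **Numerical Python**.  \n",
    "In some ways, **NumPy arrays** resemble Python's built-in **list type**. As the amount of data grows, however, NumPy arrays provide much more efficient storage and data operations. **NumPy arrays** are at the **core** of the **Python data science ecosystem**."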
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "***"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Overview"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Python is **dynamically typed**, as the following code shows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "x = 4\n",
    "x = 'four'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
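    "Here the variable **x** is rebound from an **integer** to a **string**. This **flexibility** is characteristic of dynamic typing, but it comes at a cost: every Python object has to carry extra information, such as its type, alongside its value. Consider the following **list**:"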
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "list1 = [True, '2', 3]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
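    "To preserve this **flexibility**, every item in a **Python list** must carry its own type information; that is, **each item in the list is a complete Python object**.  By contrast, a fixed-type **NumPy array lacks this flexibility, but stores and manipulates data far more efficiently**.  In short, a **NumPy array** holds values of **a single type**, whereas the elements of a **Python list** **may have different types**."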
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before using NumPy, import it. By convention, it is imported under the alias **np**."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np"
   ]
  },
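  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The contrast between a heterogeneous list and a fixed-type array can be checked directly. The cell below is a minimal sketch; the names **mixed**, **list_types** and **ints** are illustrative and not part of the original notebook. It shows that each list item keeps its own type, while an array has one type and one item size for all elements, so a float assigned into an integer array is truncated rather than changing the array's type."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Each list element is a separate Python object with its own type.\n",
    "mixed = [True, '2', 3]\n",
    "list_types = [type(item).__name__ for item in mixed]\n",
    "print(list_types)  # ['bool', 'str', 'int']\n",
    "\n",
    "# A fixed-type array stores every element with the same type and size.\n",
    "ints = np.array([1, 2, 3], dtype='int32')\n",
    "print(ints.dtype, ints.itemsize, ints.nbytes)  # int32 4 12\n",
    "\n",
    "# Assigning a float does not change the dtype; the value is truncated.\n",
    "ints[0] = 3.99\n",
    "print(ints)  # [3 2 3]"
   ]
  },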
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "***"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Creating NumPy Arrays"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.1 Creating Arrays from Lists with np.array"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First, **np.array** can create an array from a **Python list**:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "NumPy array:  [1 2 3 4 5]\n",
      "Python list: [1, 2, 3, 4, 5]\n"
     ]
    }
   ],
   "source": [
    "array1 = np.array([1, 2, 3, 4, 5])\n",
    "print('NumPy array: ', array1)\n",
    "print('Python list:', [1, 2, 3, 4, 5])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
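    " ***Notes***  \n",
    " *1*. The two are displayed differently: **print** shows a **NumPy array** with its elements separated by **spaces**, while a **Python list** is shown with **commas**.  \n",
    " *2*. A NumPy array must hold data of a single type. If the types do not match, NumPy **upcasts** where possible. In the following example the integers are converted to floats:"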
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[3.14 1.   2.  ]\n"
     ]
    }
   ],
   "source": [
    "array2 = np.array([3.14, 1, 2])\n",
    "print(array2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "***"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.2 Setting the Data Type with the dtype Keyword"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To set the **data type of the array** explicitly, use the **dtype keyword**:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[3 1 2]\n"
     ]
    }
   ],
   "source": [
    "array3 = np.array([3.14, 1, 2], dtype = 'int32')\n",
    "print(array3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
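    " ***Note***  \n",
    " *1*. The Python built-in types **int** and **float** can be passed to dtype **without quotes**. Sized types such as **int32** and **float32** are not built-in names, so they must be written either as a **quoted string** ('int32') or as a NumPy attribute (np.int32)."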
   ]
  },
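  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick check of the note above, the following sketch passes the same type to dtype in three ways. The variable names are illustrative. It assumes only that the string 'float32' and the attribute np.float32 name the same type, and that the built-in float maps to a 64-bit float."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "values = [3.14, 1, 2]\n",
    "\n",
    "# Sized type given as a quoted string.\n",
    "by_string = np.array(values, dtype='float32')\n",
    "# The same sized type given as a NumPy attribute, no quotes needed.\n",
    "by_attribute = np.array(values, dtype=np.float32)\n",
    "# Python built-in type, no quotes needed.\n",
    "by_builtin = np.array(values, dtype=float)\n",
    "\n",
    "same_dtype = by_string.dtype == by_attribute.dtype\n",
    "print(same_dtype)        # True\n",
    "print(by_builtin.dtype)  # float64"
   ]
  },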
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "***"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.3 Creating Multidimensional Arrays"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
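    "Unlike a Python list, which can only be nested, a NumPy array can be explicitly **multidimensional**. The following builds a two-dimensional array from nested sequences, one per row:"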
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[2 3 4]\n",
      " [4 5 6]\n",
      " [6 7 8]]\n"
     ]
    }
   ],
   "source": [
    "array4 = np.array([range(i, i+3) for i in [2, 4, 6]])\n",
    "print(array4)"
   ]
  },
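  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The dimensions of such an array can be inspected through its attributes. This is a small sketch with an illustrative array named **grid** that holds the same values as the example above, written out as a literal nested list."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "grid = np.array([[2, 3, 4], [4, 5, 6], [6, 7, 8]])\n",
    "\n",
    "print(grid.ndim)   # 2, the number of dimensions\n",
    "print(grid.shape)  # (3, 3), rows and columns\n",
    "print(grid.size)   # 9, the total number of elements\n",
    "print(grid[1, 2])  # 6, the element in row 1, column 2"
   ]
  },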
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "***"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.4 Creating Arrays with NumPy's Built-in Routines"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For **large arrays**, it is more efficient to create them from scratch with **NumPy's built-in routines**."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "(***1***) **np.zeros**: an array filled with ***0***"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "array5 = [0 0 0 0 0 0 0 0 0 0]\n",
      "array6 = [[0 0 0 0 0]\n",
      " [0 0 0 0 0]\n",
      " [0 0 0 0 0]]\n"
     ]
    }
   ],
   "source": [
    "array5 = np.zeros(10, dtype = int) # 1-D array\n",
    "array6 = np.zeros((3, 5), dtype = int) # 2-D array\n",
    "print('array5 =',array5)\n",
    "print('array6 =',array6)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "(***2***) **np.ones**: an array filled with ***1***"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "array7 = [1 1 1 1 1 1 1 1 1 1]\n",
      "array8 = [[1 1 1 1 1]\n",
      " [1 1 1 1 1]\n",
      " [1 1 1 1 1]]\n"
     ]
    }
   ],
   "source": [
    "array7 = np.ones(10, dtype = int) # 1-D array\n",
    "array8 = np.ones((3, 5), dtype = int) # 2-D array\n",
    "print('array7 =',array7)\n",
    "print('array8 =',array8)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "(***3***) **np.full**: an array filled with ***a given value***"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "array9 = [3.14 3.14 3.14 3.14 3.14 3.14 3.14 3.14 3.14 3.14]\n",
      "array10 = [[3.2 3.2 3.2 3.2 3.2]\n",
      " [3.2 3.2 3.2 3.2 3.2]\n",
      " [3.2 3.2 3.2 3.2 3.2]]\n"
     ]
    }
   ],
   "source": [
    "array9 = np.full(10, 3.14) # 1-D array\n",
    "array10 = np.full((3, 5), 3.2) # 2-D array\n",
    "print('array9 =',array9)\n",
    "print('array10 =',array10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "(***4***) **np.arange**: an evenly stepped sequence, similar to the built-in **range()**; the stop value is excluded"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "array11 = [ 0  2  4  6  8 10 12 14 16 18]\n"
     ]
    }
   ],
   "source": [
    "array11 = np.arange(0, 20, 2)\n",
    "print('array11 =',array11)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "(***5***) **np.linspace**: a given number of evenly spaced values over a **closed interval** (both endpoints are included by default)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "array12 = [0.   0.25 0.5  0.75 1.  ]\n"
     ]
    }
   ],
   "source": [
    "array12 = np.linspace(0, 1, 5)\n",
    "print('array12 =',array12)"
   ]
  },
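  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The difference between the two routines is easiest to see side by side. In this sketch (the variable names are illustrative), **np.arange** takes a step and leaves out the stop value, while **np.linspace** takes a number of points and includes the stop value unless **endpoint=False** is passed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Step of 2 from 0 up to, but not including, 10.\n",
    "half_open = np.arange(0, 10, 2)\n",
    "print(half_open)  # [0 2 4 6 8]\n",
    "\n",
    "# Six evenly spaced points from 0 to 10, both ends included.\n",
    "closed = np.linspace(0, 10, 6)\n",
    "print(closed)\n",
    "\n",
    "# Five evenly spaced points starting at 0, with 10 itself left out.\n",
    "open_end = np.linspace(0, 10, 5, endpoint=False)\n",
    "print(open_end)"
   ]
  },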
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "(***6***) **np.random.random**: random numbers uniformly distributed between 0 (included) and 1 (excluded)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "array13 = [[0.351325   0.21198856 0.33438375]\n",
      " [0.34795518 0.94751665 0.53576009]\n",
      " [0.90172469 0.8647817  0.15762703]]\n"
     ]
    }
   ],
   "source": [
    "array13 = np.random.random((3,3))\n",
    "print('array13 =',array13)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "(***7***) **np.random.normal**: an array of **normally distributed random numbers** with the given mean and standard deviation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "array14 = [[ 0.78272478  0.53343682  2.28611418]\n",
      " [ 0.7060398  -1.03404058 -0.4452683 ]\n",
      " [ 1.62295592  0.45360629  1.43592776]]\n"
     ]
    }
   ],
   "source": [
    "array14 = np.random.normal(0, 1, (3,3))\n",
    "print('array14 =',array14)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "(***8***) **np.random.randint**: an array of **random integers** from the given interval (lower bound included, upper bound excluded)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "array15 = [[4 5 0]\n",
      " [8 6 1]\n",
      " [7 4 3]]\n"
     ]
    }
   ],
   "source": [
    "array15 = np.random.randint(0, 10, (3,3))\n",
    "print('array15 =',array15)"
   ]
  },
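  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The random routines above produce different values on every run. One way to make the results repeatable, sketched here with an arbitrarily chosen seed of 0 and illustrative variable names, is to call **np.random.seed** before drawing. The sketch also checks that the upper bound of **np.random.randint** is never reached."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Seeding with the same value restarts the same random sequence.\n",
    "np.random.seed(0)\n",
    "first = np.random.randint(0, 10, (3, 3))\n",
    "np.random.seed(0)\n",
    "second = np.random.randint(0, 10, (3, 3))\n",
    "\n",
    "repeatable = bool(np.array_equal(first, second))\n",
    "# Values lie in 0..9: the lower bound is included, the upper bound is not.\n",
    "in_range = bool(first.min() >= 0 and first.max() < 10)\n",
    "print(repeatable, in_range)  # True True"
   ]
  },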
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "(***9***) **np.eye**: an **identity matrix** of the given size"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "array16 = [[1. 0. 0.]\n",
      " [0. 1. 0.]\n",
      " [0. 0. 1.]]\n"
     ]
    }
   ],
   "source": [
    "array16 = np.eye(3)\n",
    "print('array16 =',array16)"
   ]
  },
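  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**np.eye** also accepts a few optional arguments. The sketch below, with illustrative variable names, uses **dtype** to get integers instead of the default floats, a second size argument for a non-square result, and **k** to shift the diagonal of ones upward."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Integer identity matrix instead of the default float one.\n",
    "identity = np.eye(3, dtype=int)\n",
    "print(identity)\n",
    "\n",
    "# 2 rows and 3 columns, with ones on the main diagonal.\n",
    "rectangular = np.eye(2, 3)\n",
    "print(rectangular.shape)  # (2, 3)\n",
    "\n",
    "# k=1 places the ones on the first diagonal above the main one.\n",
    "shifted = np.eye(3, k=1)\n",
    "print(shifted)"
   ]
  },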
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "(***10***) **np.empty**: an **uninitialized array**; its values are whatever already happens to be in that memory."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "array17 = [1. 1. 1.]\n"
     ]
    }
   ],
   "source": [
    "array17 = np.empty(3)\n",
    "print('array17 =',array17)"
   ]
  },
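  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because the contents of an **np.empty** array are arbitrary, only its shape and dtype can be relied on until it is filled. A minimal sketch, with an illustrative array named **buffer** and an arbitrary fill value of 0.5:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Allocate without initializing; the values are not meaningful yet.\n",
    "buffer = np.empty((2, 3))\n",
    "print(buffer.shape, buffer.dtype)  # (2, 3) float64\n",
    "\n",
    "# Fill every element before reading from the array.\n",
    "buffer[:] = 0.5\n",
    "print(buffer)"
   ]
  },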
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "***"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.0"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": false,
   "sideBar": true,
   "skip_h1_title": true,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {
    "height": "calc(100% - 180px)",
    "left": "10px",
    "top": "150px",
    "width": "261.812px"
   },
   "toc_section_display": true,
   "toc_window_display": true
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
